Protocol for Coordinated Checkpointing using Smart Interval with Dual Coordinator

Authors

  • Manoj Kumar Niranjan
  • Mahesh Motwani
  • D. Manivannan
  • R. H. B. Netzer
  • Guohong Cao
  • Mukesh Singhal
  • E. N. Elnozahy
  • D. B. Johnson
  • Lorenzo Alvisi
  • Yi-Min Wang
  • David B. Johnson
  • M. M. Naidu
  • Sarmistha Neogy
  • Anupam Sinha
  • Pradip K Das
  • J. Makhijani
  • M. K. Niranjan
  • M. Motwani
  • A. K. Sachan
  • A. Rajput
Abstract

  • Introduction to Distributed System Design, Google Code University, http://code.google.com/edu/parallel/dsd-tutorial.html#Basics
  • D. Manivannan, R. H. B. Netzer & M. Singhal, "Finding Consistent Global Checkpoints in a Distributed Computation", IEEE Trans. on Parallel & Distributed Systems, Vol. 8, No. 6, pp. 623-627 (June 1997)
  • J. Tsai & S. Kuo, "Theoretical Analysis for Communication-Induced Checkpointing Protocols with Rollback-Dependency Trackability", IEEE Trans. on Parallel & Distributed Systems, Vol. 9, No. 10, pp. 963-971 (October 1998)
  • B. Bhargava and S. R. Lian, "Independent Checkpointing and Concurrent

Similar resources

An Efficient Algorithm in Fault Tolerance for Electing Coordinator in Distributed Systems

Distributed computing systems, predominantly computing and computer-based systems, generally tolerate undesired changes in their internal structure or external environment during regular operation; such changes are referred to as faults. A fault may be an operational fault or a design fault. Fault-tolerance techniques are used to make a system fault tolerant. Checkpointing is a te...

An Enhanced MSS-based checkpointing Scheme for Mobile Computing Environment

Mobile computing systems are made up of different components, among which Mobile Support Stations (MSSs) play a key role. This paper proposes an efficient MSS-based non-blocking coordinated checkpointing scheme for mobile computing environments. In the suggested scheme, nearly all aspects of checkpointing and their related overheads are forwarded to the MSSs and, as a result, the workload of Mobile ...

Stability Assessment Metamorphic Approach (SAMA) for Effective Scheduling based on Fault Tolerance in Computational Grid

Grid computing allows coordinated and controlled resource sharing and problem solving in multi-institutional, dynamic virtual organizations. Moreover, fault tolerance and task scheduling are important issues for large-scale computational grids because of the unreliable nature of grid resources. A commonly exploited technique to realize fault tolerance is periodic checkpointing that periodically ...

Coordinated Checkpointing using Vector Timestamp in Grid Computing

In grid computing, system recovery is carried out using checkpoints recorded at each node. The resource manager must recover the system while keeping global consistency to prevent the domino effect. Currently, coordinated checkpointing, in which all processes can be synchronized, is widely used. Considering the overhead due to synchronization, we will present a coordinated checkpoint protocol using vector ti...

Defining the Checkpoint Interval for Uncoordinated Checkpointing Protocols

Parallel applications running on large computers suffer from the absence of a reliable environment. Fault tolerance proposals, in general, rely on rollback-recovery strategies supported by checkpoint and/or message logging. There are well-defined models that address the optimum checkpoint interval for coordinated checkpointing. Nevertheless, there is a lack of models concerning uncoordinated ch...

Publication date: 2014